Valency Lexicon of Czech Verbs
Abstract
Valency is a property of language units that reflects their combinatorial potential in language utterances. The availability of valency information is assumed to be crucial in various Natural Language Processing tasks. In general, the valency of language units cannot be predicted automatically, and therefore it has to be stored in a lexicon. The primary goal of the presented work is to create a lexicon, both human- and machine-readable, that captures the valency of the most frequent Czech verbs. For this purpose, the valency theory developed within Functional Generative Description (FGD) is used as the theoretical framework. The thesis consists of three major parts. The first part contains a survey of literature and language resources related to valency in Czech and other languages. Basic properties of as many as eighteen different language resources are mentioned in this part. In the second part, we gather the dispersed linguistic knowledge necessary for building valency lexicons. We demonstrate that if manifestations of valency are to be studied in detail, it is necessary to distinguish two levels of valency. We introduce a new terminology for describing such manifestations in dependency trees; special attention is paid to coordination structures. We also give a preliminary proposal of the alternation-based lexicon model, which is novel in the context of FGD and whose main goal is to reduce redundancy in the lexicon. The third part of the thesis deals with the newly created valency lexicon of the most frequent Czech verbs. The lexicon is called VALLEX, and its latest version contains around 1600 verb lexemes (corresponding to roughly 1800 morphological lemmas); valency frames of around 4400 lexical units (corresponding to the individual senses of the lexemes) are stored in the lexicon. The main software components of the dictionary production system developed for VALLEX are outlined, and selected quantitative properties of the current version of the lexicon are discussed.
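The hierarchy named in the abstract (a lexeme groups one or more lemmas and several lexical units, and each lexical unit carries a valency frame) can be pictured with a minimal Python sketch. All class names, field names, and the sample entry below are hypothetical illustrations and do not reproduce the actual VALLEX data format; the functor labels (ACT, ADDR, PAT, DIR3) follow FGD conventions, but the specific frames are assumed for the example only.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    """One slot of a valency frame (hypothetical representation)."""
    functor: str             # FGD-style label, e.g. "ACT", "PAT"
    forms: tuple             # possible morphemic forms, e.g. case numbers
    obligatory: bool = True


@dataclass
class LexicalUnit:
    """One sense of a lexeme together with its valency frame."""
    gloss: str
    frame: list              # list of Slot objects


@dataclass
class Lexeme:
    """A verb lexeme: its lemmas plus its lexical units."""
    lemmas: list
    units: list = field(default_factory=list)


def lexicon_stats(lexicon):
    """Count lexemes, lemmas, and lexical units in a list of Lexeme objects."""
    return {
        "lexemes": len(lexicon),
        "lemmas": sum(len(lx.lemmas) for lx in lexicon),
        "lexical_units": sum(len(lx.units) for lx in lexicon),
    }


# Illustrative entry only; the frames are assumptions, not copied from VALLEX.
lexicon = [
    Lexeme(
        lemmas=["dát", "dávat"],
        units=[
            LexicalUnit("give", [Slot("ACT", ("1",)),
                                 Slot("ADDR", ("3",)),
                                 Slot("PAT", ("4",))]),
            LexicalUnit("put", [Slot("ACT", ("1",)),
                                Slot("PAT", ("4",)),
                                Slot("DIR3", ("kam",))]),
        ],
    )
]

stats = lexicon_stats(lexicon)
print(stats)
```

Counting over such a structure is how figures like those quoted above (lexemes, lemmas, lexical units) could be derived, assuming the lexicon were stored in this shape.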
Similar resources
Valency Lexicon of Czech Verbs: Towards Formal Description of Valency and Its Modeling in an Electronic Language Resource
Valency refers to the capacity of a verb (or a word belonging to another part of speech) to take a specific number and type of syntactically dependent language units. Valency information is thus related to particular lexemes, and as such it is necessary to describe valency characteristics for separate lexemes in the form of lexicon entries. A valency lexicon is indispensable for any complex Natura...
Reflexive Verbs in a Valency Lexicon: The Case of Czech Reflexive Morphemes
In this paper, we deal with Czech reflexive verbs from the lexicographic point of view. We show that the Czech reflexive morphemes se and si constitute different linguistic meanings: either they are formal means of the word formation process of so-called reflexivization, or they are associated with the syntactic phenomena of reflexivity, reciprocity, and diatheses. All of these processes ar...
Valency Lexicon of Czech Verbs: Alternation-Based Model
The main objective of this paper is to introduce an alternation-based model of the valency lexicon of Czech verbs, VALLEX. Alternations describe regular changes in the valency structure of verbs; they are seen as transformations that take one lexical unit and return a modified lexical unit as a result. We characterize and exemplify ‘syntactically-based’ and ‘semantically-based’ alternations and their effe...
Valency Lexicon for Czech: From Verbs to Nouns
The valency lexicon of Czech verbs has been intensively worked on for more than a year, and we now have at our disposal a detailed description of the valency frames of several hundred verbs. The challenge that naturally arises is to use the existing lexicon for capturing the valency of other word classes. In this paper, we focus on the valency of nouns derived from verbs. We propose an algorithm for autom...
Changes in Valency Structure of Verbs: Grammar vs. Lexicon
In this paper, we deal with changes in valency structure of Czech verbs from a lexicographic point of view. We focus only on syntactic constructions that are related in principle to the same (generalized) situation. Changes in valency structure are understood as different mappings between individual participants of a generalized situation and valency slots, including their morphemic realization...
Semantic Roles in Valency Lexicon of Czech Verbs: Verbs of Communication and Exchange
We introduce a project to enhance a valency lexicon of Czech verbs with semantic roles. For this purpose, we make use of FrameNet. At the present stage, frame elements from FrameNet have been mapped to valency complementations of verbs of communication and verbs of exchange. The feasibility of this task has been proven by the achieved inter-annotator agreement – 95.6% for the verbs of communica...